Complementary DNA (cDNA) libraries were constructed representing the genome RNA of the coronavirus mouse 
hepatitis virus, strain A59 (MHV-A59). From these libraries clones were selected to form a linear map across the entire 
gene A, the putative viral polymerase gene. This gene is approximately 23 kb in length, considerably larger than earlier 
estimates. Sequence analysis of the 5' terminal region of the genome indicates the presence of the 66-nucleotide leader 
that is found on all mRNAs. Secondary structure analysis of the 5' terminal region suggests that transcription of leader 
terminates in the region of nucleotide 66. The sequence of the first 2000 nucleotides is very similar to that reported for 
the closely related JHM strain of MHV and potentially encodes p28, a basic protein thought to be a component of the 
viral polymerase (L. Soe, C. K. Shieh, S. Baker, M. F. Chang, and M. M. C. Lai, 1987, J. Virol., 61, 3968-3976). Gene A 
contains two of the consensus sequences found in intergenic regions. One is adjacent to the 5' leader sequence and 
the other is upstream from the initiation codon for translation of gene B. © 1989 Academic Press, Inc. 


INTRODUCTION 

The genome of the murine coronavirus, mouse hepatitis
virus, strain A59 (MHV-A59), is a single-stranded
polyadenylated RNA of positive polarity and is at least
30 kb in length (see below) (Lai and Stohlman, 1978).
During infection six capped, polyadenylated subgenomic
messenger RNAs are synthesized in addition to
full-length positive-sense RNA. The subgenomic RNAs
form a nested set, all overlapping with the 3' end of
genome RNA and all containing the same 5' 66-nucleotide
"leader" sequence (Lai et al., 1981, 1984; Leibowitz
et al., 1981; Spaan et al., 1982, 1983; Weiss and
Leibowitz, 1983; see Fig. 1A). A full-length
negative-stranded RNA serves as a template for the synthesis of
all virus-specific mRNAs (Lai et al., 1983). The common
5' leader sequence is transcribed from the 3' end of this
template and is subsequently used as a primer for
mRNA synthesis or, alternatively, spliced
co-transcriptionally to the body of the mRNAs (Baric et al., 1987a;
Lai et al., 1983, 1984; Spaan et al., 1988).

The coronavirus polymerase needs to possess many
activities to initiate and synthesize negative-strand
RNA, positive-strand genome length RNA and mRNAs
(both genomic and subgenomic), leader RNA, and
poly(A). In addition it is likely to possess capping
activities as well as the ability to switch templates during
transcription to promote recombination (Lai et al.,
1985). Polymerase activity is membrane associated
and has been detected in extracts of infected cells
(Brayton et al., 1982, 1984), but has not yet been
purified. Gene A, the putative polymerase gene, is
extremely large and, like the avian coronavirus IBV
polymerase gene, has the capacity to encode 700 kDa
of protein (Boursnell et al., 1987). To understand both
leader transcription and the structure of the
polymerase, we have cloned the MHV-A59 gene A. We
describe here the complementary DNA (cDNA) clones
and the analysis of the 5' terminal end of the genome.

MATERIALS AND METHODS 
Cells and viruses 

MHV-A59 (Manaker et al., 1961) was grown in
monolayer cultures of murine fibroblast 17Cl-1 cells.
For preparation of virus-specific RNA, cells were
infected at an m.o.i. of 1 and harvested approximately 16
hr later.

Preparation and analysis of RNA 

Viral genome RNA was prepared from purified virions
as described previously (Budzilowicz et al., 1985).
Intracellular RNA was extracted from the cytoplasm of
MHV-A59-infected or mock-infected cells as
previously described (Budzilowicz et al., 1985).
Electrophoresis of RNA in the presence of formaldehyde was
carried out according to the method of Lehrach et al.
(1977). After electrophoresis, RNA was transferred to
nitrocellulose and hybridized with radiolabeled probes
according to the method of Thomas (1980).

Complementary DNA cloning 

First-strand cDNA synthesis was carried out as
described previously using genome RNA that had been
denatured with methyl mercury hydroxide and either
random oligomers of calf thymus DNA or restriction
fragments as primers (Budzilowicz et al., 1985). In most
cases second-strand cDNA was synthesized using
Escherichia coli DNA polymerase I in the presence of
RNase H according to the method of Gubler and
Hoffman (1983). Oligo(dC) tails were added to the 3' ends of
double-stranded cDNA molecules. These tailed
DNAs were annealed to oligo(dG)-tailed PstI-digested
pBR322 and used to transform E. coli, strain DH5. In a
second library, derived from an independent A59
stock, oligo(dG)-tailed cDNAs were inserted into pUC9
and transformed into E. coli, strain JM109. The cDNA
clones containing the genomic leader sequence were
identified using a leader-specific probe containing four
tandem copies of the 72 nucleotides at the 5' end of
A59 genome (pGEM1.L3/4).

Preparation and analysis of DNA 

DNA was electrophoresed on agarose gels and then
transferred to nitrocellulose for Southern blot analysis
(Southern, 1975). DNA was labeled with 32P by
nick-translation to a specific activity of >10^8 cpm/µg (Rigby
et al., 1977).

DNA sequencing 

DNA restriction fragments as described under
Results were ligated into the polylinker region of
M13mp19 RF, which was used to transform E. coli
JM103 (Messing et al., 1981). Single-stranded DNA
from M13 subclones was sequenced according to a
modification of the method of Sanger et al. (1977) using
[35S]dATP (Biggins et al., 1983) and T7 DNA
polymerase (Sequenase, U.S. Biochemicals) in place of the
Klenow fragment of polymerase I. Sequencing primers
were the universal primers (Promega) or synthetic
oligonucleotide primers. The synthetic oligonucleotide
primers, which were 20 bases long, were synthesized at
the Wistar Institute or the Chemistry Department at the
University of Pennsylvania (Philadelphia, PA).
Sequencing reactions were resolved on gradient polyacrylamide
gels (Biggins et al., 1983). All sequence analysis was
carried out using the programs of PC Gene on an
Epson Equity HIT computer or the Intelligenetics
programs through Bionet.

RESULTS 

Construction of gene A-specific cDNA clones 

Complementary DNA cloning was carried out using
random oligomers of calf thymus DNA as primers for
transcription of cDNA from polyadenylated genome
RNA. In order to maximize the size of the cDNAs,
cloning was carried out according to the method of Gubler
and Hoffman (1983) in which there is no S1 nuclease
treatment. The average size of the clones obtained was
approximately 1.5 kb with the largest being 4.1 kb.

The recombinant plasmid DNAs from 40 of the
largest clones were radiolabeled by nick-translation and
used as probes on Northern blots of RNA from both
MHV-A59-infected and mock-infected cells. The
nested set nature of the RNAs allows the mapping of
clones to the unique regions of the RNAs and thus to
individual viral genes. DNA from clones mapping to
gene G hybridizes to all seven viral mRNAs since gene
G sequences are present in all seven mRNAs.
Conversely, gene A-specific sequences hybridize only to
mRNA 1 since this is the only viral mRNA that contains
gene A sequences (Fig. 1). It was important to include
uninfected cell RNA on these blots because earlier we
obtained clones that appeared to hybridize to
MHV-A59 RNA 1 but also hybridized to a band of similar
mobility in uninfected cells. Twenty-seven of the 40 cDNA
clones analyzed this way mapped to gene A.
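
The mapping logic described above can be sketched in a few lines of Python. This is an illustrative sketch only and not part of the original analysis: it assumes that genes A through G lie in that order from 5' to 3' and that mRNA n begins at the n-th gene and extends to the 3' end of the genome. The function names are hypothetical.

```python
# Gene order along the genome, 5' to 3', as named in the text (assumption:
# mRNA n starts at the n-th gene and runs to the 3' end of the genome).
GENES = ["A", "B", "C", "D", "E", "F", "G"]


def mrnas_containing(gene):
    """Return the mRNA numbers (1-7) that contain sequences of the given gene."""
    index = GENES.index(gene)
    # Gene at position index is present in mRNA 1 through mRNA index + 1.
    return list(range(1, index + 2))


def three_prime_most_gene(hybridizing_mrnas):
    """Infer the 3'-most gene covered by a probe from the mRNAs it detects.

    A probe that detects mRNA k but not mRNA k + 1 carries sequence from the
    unique region of mRNA k and nothing farther downstream.
    """
    smallest_mrna = max(hybridizing_mrnas)
    return GENES[smallest_mrna - 1]


if __name__ == "__main__":
    print(mrnas_containing("G"))          # gene G probe: all seven mRNAs
    print(mrnas_containing("A"))          # gene A probe: mRNA 1 only
    print(three_prime_most_gene([1, 2]))  # pattern seen with a gene B probe
```

Note that a probe detecting mRNAs 1 and 2 may still extend upstream into gene A; the hybridization pattern alone fixes only the 3' limit of the insert.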

Clone 1033 was used to orient the gene A clones. 
Although results from Northern blot analysis indicated 
that cDNA 1033 mapped to gene B (data not shown, 
but hybridization was to mRNAs 1 and 2 as in Fig. 1B, 
lane D), the size of the insert (3600 bp) suggested that 
1033 must also contain sequences from the 3' end of 
gene A since gene B is approximately 2.1 kb (Luytjes 
et al., 1988). In an attempt to more precisely localize 
cDNA 1033 within genes A and B, the two fragments 
derived by digestion with Pst I were radiolabeled and 
used as probes on Northern blots of infected cell RNAs 
(Fig. 1B). The 1.3-kb fragment hybridized to mRNA 1
only while the 2.3-kb fragment hybridized to both
mRNAs 1 and 2, and must therefore extend into gene 
B. We have confirmed the presence of the gene A/B 
intergenic region by sequencing a portion of the 2.3-kb 
fragment (see below). 

To determine overlaps among the gene A clones, the
viral inserts were digested with restriction enzymes
and analyzed by cross-hybridization on Southern blots
(Southern, 1975). The gene A-specific clones fell into
clusters separated by gaps. The remaining
uncharacterized clones in the cDNA library were screened by
hybridization to probes made from cDNAs which
mapped to regions bordering a gap. This extended the
lengths of the clusters, resulting in three groups of
clones. These were 1033 to 920, 1220 to 917, and 136
to 1410 (Fig. 2). There remained as well a gap at the 5'
end of genome; none of the clones hybridized to a
probe made from the leader sequence-specific clone
pGEM1.L3/4. From a cloning using a 1200-nucleotide
EcoRI/HindIII 917 restriction fragment as primer, we
selected clone D34, which was contiguous with clone
136. Clones 67 and 98, obtained from a second
randomly primed library made from an independent stock
of MHV-A59, closed the gap between 1220 and 920.
From a cloning using a 750-nucleotide AvaI/HindIII
fragment of 1410, we obtained many leader-positive
clones, among which were clones 077 and ZA11.
Clone ZA11 shared homology with 1410 and thus
completed the cloning of the 5' end. Clones 005 and
507B12, both located near the 5' end of genome, were
obtained from the second randomly primed library. Of
all the 5' proximal clones only ZA11 contained the
entire leader sequence. The contiguous series of cDNAs
revealed that gene A was considerably larger than the
7 kb previously estimated (Lai and Stohlman, 1978;
Leibowitz et al., 1981; Spaan et al., 1982). The restriction
digests and Southern blots indicated that the A59 gene
A is approximately 23 kb, similar to the avian
coronavirus (IBV) gene A (Fig. 2).

Fig. 1. Mapping of virus-specific cDNAs to viral mRNAs. (A) The MHV-A59 positive-strand genome, its negative-strand complement, and
seven mRNAs are shown. (B) RNA was extracted from MHV-A59-infected and mock-infected cells, electrophoresed on agarose gels, and
transferred to nitrocellulose. DNA fragments were excised from pBR322, labeled by nick-translation, and used as probes on these Northern
blots. Lane A, mock-infected cell RNA, 1033 1.3-kb fragment; lane B, infected cell RNA, 1033 1.3-kb fragment; lane C, mock-infected cell RNA,
1033 2.3-kb fragment; lane D, infected cell RNA, 1033 2.3-kb fragment; lane E, mock-infected cell RNA, clone g344 (representing genes D, E,
F, G); and lane F, infected cell RNA, g344.

Fig. 2. Map of gene A-specific DNA clones. cDNAs were mapped by restriction analysis and Northern and Southern blots as described in the
text. Only the enzymes PstI, EcoRI, HindIII, and BamHI were used across the entire genome. Nucleotide -9589 marks the beginning of the
intergenic sequence between genes A and B and is 9589 nucleotides from the poly(A) tail of genome RNA.

Fig. 3. Sequencing strategy for the 5' end of MHV-A59 genome
RNA. The viral fragments from clones 077, ZA11, 005, and 507B12
were cleaved with restriction enzymes and subcloned into
M13mp19 RF. Sequencing was carried out using the universal
sequencing primer in the regions and directions indicated by the
arrows. The letters above the arrows designate the enzymes used to
make the subclones: RsaI (R), AluI (A), BamHI (B), PstI (P), EcoRI (E),
HindIII (H). The arrows designated O were derived by sequencing
using synthetic oligonucleotides as primers on 077 DNA.

The 5' end of the genome 

The sequence of the 5' 2000 nucleotides of the
MHV-A59 genome RNA was obtained by sequencing
of clones 077, ZA11, 005, and 507B12 (Fig. 2). The
viral fragments were digested with the restriction
enzymes listed in the legend to Fig. 3 and the fragments
subcloned into the RF of M13mp19. Sequencing was
carried out using the dideoxynucleotide chain
termination technique with a universal sequencing primer and
synthetic oligonucleotide primers (Fig. 3). The 5' leader
sequence, obtained from ZA11, is the same as that
obtained from A59 mRNA (Lai et al., 1984). A typical
intergenic sequence, AAUCUAAAC, is observed at
nucleotides 61-71 (Table 1). Downstream of the intergenic
region, the 5' A59 sequence contains a short open
reading frame starting at nucleotide 99 and a longer
one starting at nucleotide 210 and continuing past
nucleotide 2000 (Fig. 4). The sequence was translated into
amino acids. The protein predicted from the sequence
of the open reading frame beginning at nucleotide 210
is basic and almost identical to the predicted JHM
protein, which has been shown to be related to p28, the
putative polymerase-related polypeptide (Soe et al.,
1987). The few amino acid differences between the
strains, most of which are conservative, are shown
below the A59 sequence. The JHM sequence contains
an extra AAUCU repeat at the end of the leader RNA
compared to A59 (see Fig. 5).
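
The identification of open reading frames can be illustrated with a short Python sketch. It is not the PC Gene or Intelligenetics procedure used for the analysis; it is a minimal scan, under the assumption that an open reading frame begins at an AUG and runs in frame to the first UAA, UAG, or UGA. The function name and the toy sequence are hypothetical. Every AUG is reported, including AUGs internal to a longer frame.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orfs(rna, min_codons=2):
    """Scan an RNA sequence for AUG-initiated open reading frames.

    Returns a list of (start, n_codons, terminated) tuples, where start is the
    1-based position of the A of the AUG, n_codons counts sense codons
    (including the AUG, excluding the stop), and terminated is False when the
    frame is still open at the end of the available sequence.
    """
    rna = rna.upper().replace("T", "U")
    orfs = []
    for start in range(len(rna) - 2):
        if rna[start:start + 3] != "AUG":
            continue
        pos = start
        # Step codon by codon until a stop codon or the end of the sequence.
        while pos + 3 <= len(rna) and rna[pos:pos + 3] not in STOP_CODONS:
            pos += 3
        terminated = pos + 3 <= len(rna)
        n_codons = (pos - start) // 3
        if n_codons >= min_codons:
            orfs.append((start + 1, n_codons, terminated))
    return orfs


if __name__ == "__main__":
    # Toy sequence: a short closed frame followed by a frame that runs off
    # the end, analogous to the frame that continues past nucleotide 2000.
    for orf in find_orfs("GGAUGAAAUAGCCAUGGCUGCUGCU"):
        print(orf)
```

A frame reported with terminated set to False corresponds to the situation described for the frame beginning at nucleotide 210, whose end lies beyond the sequenced region.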

The secondary structure of the first 150 nucleotides
of the A59 genome was predicted using the method
of Zuker and Stiegler (1981) and, as shown in Fig. 5A,
contains two stems and loops. The structure of the 5'
terminus is of interest because the leader RNA
presumably is transcribed from this region. A similar structure
is observed for this region in the JHM genome (Fig. 5B).
In this structure, nucleotides 52-69 of A59, containing
the end of the leader and the leader/gene A intergenic
region, are found in the region between the loops and
therefore may be a likely site for termination of leader
synthesis, as A/U-rich regions following areas of
secondary structure have been associated with
termination sites in several systems (Henikoff et al., 1983; Mills
et al., 1980; Zaret and Sherman, 1982).

Because IBV contains terminal repeats proposed as
possible polymerase recognition sites, computer
analyses using the Intelligenetics programs for sequence or
structural homologies were carried out comparing the
3' end of negative strand (complement of the 5' end of
plus strand) with the published sequence of the 3' end
of the MHV-A59 genome (Armstrong et al., 1983).
There were no significant homologies observed
between the 350 nucleotides at the 3' ends of MHV-A59
positive and negative strands.
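
The comparison rests on the fact that the 3' end of the negative strand is the reverse complement of the 5' end of the positive strand. The Python sketch below shows that relationship and a simple longest-shared-run comparison. It is an assumption-laden illustration and does not reproduce the Intelligenetics homology programs; the function names and the short test sequences are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def negative_strand(plus_strand):
    """Return the negative strand, written 5' to 3' (reverse complement)."""
    return "".join(COMPLEMENT[base] for base in reversed(plus_strand.upper()))


def three_prime_end(rna, length):
    """Return the last `length` nucleotides of a strand written 5' to 3'."""
    return rna[-length:]


def longest_shared_run(a, b):
    """Length of the longest identical run shared by two sequences."""
    best = 0
    previous = [0] * (len(b) + 1)
    for base_a in a:
        current = [0] * (len(b) + 1)
        for j, base_b in enumerate(b, start=1):
            if base_a == base_b:
                # Extend the run that ended at the previous position in both.
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


if __name__ == "__main__":
    plus = "UAUAAGAGUG"  # first ten nucleotides of the leader (Table 1)
    minus = negative_strand(plus)
    print(three_prime_end(minus, 4))
    print(longest_shared_run("AAUCUAAAC", "GGAUCUAAGG"))
```

In practice the two 350-nucleotide 3' ends would be passed to a scored alignment rather than an exact-run search; the exact-run version is used here only because it is short and easy to verify by hand.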

The 3' end of MHV gene A 

The 3' end of the MHV-A59 gene A was sequenced
using clone 1033. A 750-nucleotide SphI/SacI piece of
the 2.3-kb PstI fragment of 1033 was subcloned into
pGEM4 (Promega) and sequenced from each end
using the universal sequence primers and using a
synthetic oligonucleotide primer derived from the
sequence 50 nucleotides upstream of the intergenic
region found between genes A and B. This intergenic
sequence, AAUCUAUAC, is followed by the first open
reading frame of gene B (Luytjes et al., 1988). This
completes the list of intergenic regions for MHV-A59
(see Table 1). All intergenic regions contain the
sequence AAUCUAAAC with the exception of the
single-nucleotide changes in two of the intergenic regions (A/B
and E/F) indicated by the underlined nucleotides in
Table 1.

TABLE 1

The MHV-A59 Intergenic Regions

Leader/A¹      AAUCUAAAC
Gene A/B²      AAUCUAUAC
Gene B/C³      AAUCUAAAC
Gene C/D⁴      AAUCUAAAC
Gene D/E⁵      AAUCUAAAC
Gene E/F⁵      AAUCCAAAC
Gene F/G⁵      AAUCUAAAC
Consensus      AAUCUAAAC

MHV-A59 leader sequence

        10         20          30         40         50         60
UAUAAGAGUG AUUGGCGUCC GUACCCUUUCA ACUCUAAAAC UCUUGUAGUU UAAAUCUAAU

        70
CUAAUCUAAA C⁶

¹ Sequence obtained from clone ZA11.
² Sequence obtained from clone 1033.
³ Sequence obtained from Luytjes et al. (1987).
⁴ Sequence from our unpublished sequence of gene D.
⁵ Sequence from Budzilowicz et al. (1985).
⁶ The underlined sequence at the end of the leader is that region homologous with the intergenic regions.
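
The comparison of each intergenic region with the consensus can be expressed as a short Python sketch. The sequences are those listed in Table 1; the function and variable names are hypothetical and the code is offered only as an illustration of the comparison, not as the method used to derive the table.

```python
CONSENSUS = "AAUCUAAAC"

# Intergenic sequences as listed in Table 1.
INTERGENIC = {
    "Leader/A": "AAUCUAAAC",
    "Gene A/B": "AAUCUAUAC",
    "Gene B/C": "AAUCUAAAC",
    "Gene C/D": "AAUCUAAAC",
    "Gene D/E": "AAUCUAAAC",
    "Gene E/F": "AAUCCAAAC",
    "Gene F/G": "AAUCUAAAC",
}


def mismatches(sequence, consensus=CONSENSUS):
    """Return (position, consensus_base, observed_base) for each difference.

    Positions are 1-based within the nine-nucleotide motif.
    """
    return [
        (i + 1, expected, observed)
        for i, (expected, observed) in enumerate(zip(consensus, sequence))
        if expected != observed
    ]


if __name__ == "__main__":
    for region, sequence in INTERGENIC.items():
        print(region, mismatches(sequence))
```

Only the A/B and E/F regions return a non-empty list, each with a single substitution (position 7 and position 5 of the motif, respectively).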


DISCUSSION 

We have created a restriction map of the unique
portion of mRNA 1, the putative polymerase gene. This
region is bounded by the leader sequence and
intergenic sequence on the 5' end and an intergenic
sequence on the 3' end. The data suggest that the entire
gene is approximately 23 kb, which implies that
the genome is in excess of 30 kb. The exact size of the
genome cannot be known until the entire sequence is
obtained.
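
The size estimate follows from simple arithmetic, sketched here in Python. The gene A length and the 9589-nucleotide distance from the gene A/B intergenic sequence to the poly(A) tail are taken from the text and the legend to Fig. 2; the mean residue mass of 110 Da is a rule-of-thumb assumption introduced for illustration, and the short leader region is ignored. The names are hypothetical.

```python
GENE_A_NT = 23_000         # approximate length of gene A from restriction mapping
A_B_TO_POLY_A_NT = 9_589   # gene A/B intergenic sequence to poly(A) tail (Fig. 2)
AVG_RESIDUE_DA = 110       # assumed mean mass of an amino acid residue


def estimated_genome_nt():
    """Rough genome length: gene A plus everything 3' of the A/B junction."""
    return GENE_A_NT + A_B_TO_POLY_A_NT


def max_coding_capacity_kda(nucleotides):
    """Upper bound on encoded protein mass if every codon were translated."""
    return (nucleotides // 3) * AVG_RESIDUE_DA / 1000


if __name__ == "__main__":
    print(estimated_genome_nt())               # total in nucleotides
    print(max_coding_capacity_kda(GENE_A_NT))  # upper bound in kDa
```

The sum exceeds 30,000 nucleotides, in line with the statement above; the coding-capacity figure is only an upper bound, since untranslated regions and stop codons reduce it.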

Gene A is likely to encode the MHV polymerase
because coronaviruses contain infectious genome RNA
(Lomniczi, 1977; our unpublished results) and initiate
infection by the translation of viral polymerase(s) from
virion RNA. Furthermore, it is likely that translation of
genome proceeds from the 5' end, as is the case with
other eukaryotic mRNAs. The 5' unique region of the
avian coronavirus IBV is composed of two large ORFs
(F1 and F2) which potentially encode polypeptides of
441K and 300K (Boursnell et al., 1987). It is thought
that a frameshifting mechanism (Jacks and Varmus,
1985; Brierley et al., 1987) results in the synthesis of one
large polypeptide from the two ORFs. Although the 5'
regions of MHV-A59 and IBV do not share significant
homology (Fig. 4) (Boursnell et al., 1987), protein
sequence homology between the protein predicted by
the IBV ORF2 and the polypeptide predicted from the
nucleotide sequence of the 3' portions of MHV-A59
gene A has been observed (Bredenbeek et al.,
manuscript in preparation).

A direct identification of the coronavirus polymerase
is lacking and there is little information about the MHV
gene A products. Purified MHV virion RNA directs the
cell-free synthesis of a 250K polypeptide that is
processed in vitro into 220K and 28K polypeptides
(Denison and Perlman, 1986); the 28K polypeptide is derived
from the amino terminus and has been identified in the
infected cell (Denison and Perlman, 1987). A basic
protein with the characteristics of the 28K polypeptide is
encoded in the 5' end of the MHV-JHM and A59
genome RNAs (Soe et al., 1987). We are currently using
antisera directed against procaryotic/viral fusion
proteins representing various portions of gene A to
characterize the viral polypeptides responsible for
polymerase activity in infected cells.

The 5' end sequence of the MHV-A59 genome RNA
is almost identical to that of the JHM genome (Soe et
al., 1987). The major difference is an additional AAUCU
(see Fig. 5) at the end of the JHM leader in the region
thought to hybridize with template. It is not clear
whether these differences are meaningful since the
leader RNAs from each of these strains readily
reassociate with templates from either strain (Makino et al.,
1986). The differences in the predicted amino acid
sequences are mostly conservative and are indicated in
Fig. 4. Thus the p28 putative polymerase-related
protein is conserved between the two strains.

[Figure: nucleotide sequence of positions 1-2000 with the predicted amino acid sequence beneath; JHM amino acid differences are listed below the A59 residues.]

Fig. 4. Sequence of the 5' end of genome RNA. The sequence of
the 2000 nucleotides and the amino acids encoded at the 5' end of
MHV-A59 genome RNA are shown. Also shown are amino acids that
differ between A59 and JHM.

Since the 3' end of the viral negative strand (the
complement of the 5' end of the genome) is likely to be the
template for transcription of leader RNA, we analyzed
the secondary structure of this region. Secondary
structure analysis of the 5' end of the A59 genome
RNA showed two stem and loop structures with ΔG
= -35.2 kcal. This structure is similar to that obtained
by a similar analysis of the 5' end sequence of the JHM
genome. This structure is maintained even as more
nucleotides (up to 350) are added to the analysis, which
suggests that it is maintained as transcription
continues further. It is not clear how leader is transcribed,
where its transcription is terminated, and where it is
released from its template. It seems likely that the
leader transcript ends at nucleotide 66, since nucleotide
67 is the first nonhomologous nucleotide between leader
and template in the gene E/F intergenic region (Table
1). This is supported by the observation that this is an
AU-rich region and is found in a region not involved in
a secondary structure loop (Henikoff et al., 1983; Mills
et al., 1980; Zaret and Sherman, 1982). If nucleotide
66 is the end of the leader transcript, then the AAAC
(nucleotides 68-71) may be important not for primer
hybridization but perhaps for protein interactions
needed for initiation of transcription.

Secondary structure analysis of nucleotides 48-150
of MHV-JHM led to the hypothesis that the AU-rich
region around nucleotides 75-76 (Fig. 5) may be a site
where polymerase pauses and marks the end of leader
transcription (Baric et al., 1987a,b; Shieh et al., 1987).
This is supported by the presence of leader-related
transcripts greater than 66 nucleotides in infected cells
(Baric et al., 1987a; Shieh et al., 1987). However, as
predicted for MHV-A59 or for JHM, when nucleotides
1-150 (including the complete 5' terminus) are
analyzed, nucleotides 75 and 76 are contained within the
second stem and are not left free.

The 3' ends of positive- and negative-strand
genome-size RNA presumably encode important regulatory
sequences involved in the replication of genome. It was
of interest therefore to compare the 3' terminal
sequence of plus- and minus-strand RNAs. Conservation
of a primary nucleic acid sequence or of a secondary
structure at the 3' ends of both plus- and minus-strand
RNAs would suggest that the same or a similar
replication complex recognizes both RNA species. Indeed,
sequence homology between the 3' ends of both plus-
and minus-strand RNAs has been identified in the
related coronavirus, IBV. This sequence is about 60
nucleotides in length and is located approximately 50
nucleotides from the 3' ends of both RNA species
(Boursnell et al., 1987). However, this homology was
not observed in the case of MHV-A59. This is not
surprising since in the cases of some other classes of RNA
viruses, for example, flaviviruses (Brinton and Dispoto,
1988) and alphaviruses (Levis et al., 1986), the
regulatory sequences or secondary structures are different
on plus and minus strands.

Fig. 5. Computer prediction of the secondary structure of the 5' end of genome RNA. The first 150 nucleotides of the MHV A59 and JHM
genome sequence were analyzed by the program of Zuker and Stiegler (1981). Arrowheads are used to mark the differences between the
sequences of the two strains. (A) MHV-A59 sequence. The loop encompassing nucleotides 1-52 has a ΔG = -12.4 kcal while the loop
encompassing nucleotides 69-142 has a ΔG = -22.8 kcal. The total ΔG = -35.2 kcal. Nucleotide 66 marks the last nucleotide that is conserved in
the leader sequences of all mRNAs and is thus a possible site of termination of leader RNA. Nucleotide 76 is the site proposed by Shieh et al.
(1987) as a possible termination site for transcription of larger leader transcripts. (B) JHM sequence taken from Soe et al. (1987). The loop
encompassing nucleotides 1-52 has a ΔG = -12.4 kcal while the loop encompassing nucleotides 74-147 has a ΔG = -19.7 kcal. The total
ΔG = -32.1 kcal. The extra AAUCU found in the JHM sequence is underlined.